ABSTRACT

This  paper  discusses  the  cooperative  research  and 
development  work  between  Delphi  Automotive  Systems 
(formerly  known  as  Delphi  Interior  &  Lighting  Systems) 
and  the  U.S.  Army  TACOM/TARDEC.  Discussion  will 
focus  on  past  work  and  the  evolution  of  the  approaches 
currently  being  used  by  TARDEC  and  Delphi  for  digital 
human  animation,  real-time  human  interaction  (man  in 
the  loop),  and  motion  data  library  development,  as  it 
relates  to  TACOM/Delphi’s  electromagnetic  motion 
capture  systems  and  the  Engineering  Animation 
Incorporated  (EAI)-Jack  ergonomic  analysis/animation 
software. 

INTRODUCTION 

TACOM and Delphi Automotive Systems have been
working together on a joint research effort in which
motion capture hardware is applied to the scripting of
anthropomorphically correct digital humans to evaluate
digital mock-ups in a computer-generated synthetic
environment. The focus of this research was to address
the lack of digital human capabilities that an operator,
maintainer, or trainer needs in order to accurately
evaluate the performance requirements (i.e., human
factors, maintainability, etc.) of a digital mock-up using
today’s state-of-the-art modeling and simulation tools.

The system used by Delphi consists of a one-wall CAVE
(CAVE Automatic Virtual Environment), an Ascension
MotionStar electromagnetic motion tracker, and EAI-Jack
ergonomic simulation/analysis software. At the U.S.
Army TACOM, the system includes a Virtual Research
V8 Head Mounted Display (HMD), an Ascension Flock of
Birds (FOB) motion tracker with 10 position sensors
(i.e., birds), and the EAI-Jack software.

Initial  discussion  will  focus  on  the  electromagnetic  motion 
capture  hardware  setup  and  configuration  at  the  two 
organizations. The problems with electromagnetic
devices that are associated with these specific
configurations, and that were discovered as part of this
joint research, will be discussed. For example, the
differences between the optional Serial and Ethernet
configurations for communicating with the motion
tracker can be a source of problems. This
paper  will  then  focus  on  the  digital  human,  the  sites  used 
to  constrain/control  the  human,  the  minimum  number  of 
sites  required  to  generate  motion  data  files,  the  site 
orientation/position/offset  (i.e.,  sites  selected  on  the 
digital  human  constrained  by  the  position  sensor  which 
define  the  digital  human’s  motion),  and  the  pros  and  cons 
of using a minimum number of position sensors. Finally,
this paper will discuss the motion data file generated as a
result of a motion capture session (i.e., the types of files
that can be generated and the associated motion data
format). The conclusion will address areas that need
future development and new trends in motion tracking.

DIGITAL  HUMANS  AND  ELECTROMAGNETIC 
MOTION  CAPTURE 

THE FOCUS - Manual scripting of digital human avatars
has been, and still is, the norm when developing
animations of maintenance, manufacturing, and
automotive cockpit operations for both TACOM and
Delphi. The problem with this approach is that digital
human motion developed through manual scripting is
only an approximation of the actual human motions
required to accomplish the task, and it results in rigid,
robot-like interactions of the digital human with the
virtual environment. Manual scripting is also very time
consuming, with the time required to develop an
animation increasing as the number of human joints
involved increases. These animations satisfy the initial,
or base, requirement of digitally imitating rough human
interaction within a virtual world. To fully analyze the
human factors effects of an entirely digital concept on
the human as the end user, however, greater accuracy is
needed. The way the digital human interacts with the
digital model needs to mimic the actual motions that
would be used in the real world, so that human aspects
such as reach, posture, lifting, comfort, and vision can be
analyzed during the digital development of the human
interface.

ELECTROMAGNETIC MOTION CAPTURE - There are
basically two types of electromagnetic motion capture
systems available today: the AC magnetic systems
currently produced by Polhemus, and the pulsed DC
systems sold by Ascension. The discussion will focus on
the use of the Ascension system with the EAI-Jack
software environment as it relates to this joint research
program. An electromagnetic motion capture system
consists of a transmitter, a transmitter controller, N
sensors, and a Central Processing Unit (CPU)
(figure 1). The CPU calculates the position and
orientation of each sensor, or bird, based on the bird’s
reaction to the electromagnetic field generated by the
transmitter, and then passes this data on to the host
computer where the motion data is displayed.


Figure 1. Ascension MotionStar and Flock of
Birds  systems 


The Ascension Extended Range Transmitter (ERT) is a
one-foot square black box that generates an
electromagnetic field with a radius of 8 to 10 feet. Motion
capture occurs within a hemispherical region of this field
(figure 2).


Figure  2.  Lower  Electromagnetic  ERT  field  hemisphere 


There are six possible hemispheres that you can select
for your motion capture region (i.e., forward, rear, right,
left, upper, and lower), and this selection controls how
you place your transmitter. The default hemisphere for
motion capture within the EAI-Jack environment is the
forward hemisphere (figure 3), which should be an
adequate configuration for most motion capture sessions.
Until recently, there were two modes of operation for the
transmitter, normal and expanded addressing; a
super-expanded addressing mode has recently been
added. A little known fact about the first two modes is
their effect on the electromagnetic field strength. Under
normal addressing mode, the field strength is varied with
respect to the proximity of the closest sensor, which
allows the bird to come right up to the transmitter. The
problem is that the field strength is then weaker for the
birds farthest away. Under expanded addressing mode,
the field strength is constant, but a sensor cannot come
within a two-foot radius of the transmitter.
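
The hemisphere selection and the addressing-mode limits described above can be summarized as a simple position check. The following Python sketch is illustrative only: the axis convention for the six hemispheres, the function name in_capture_region, and the conservative 8-foot default radius are assumptions made for this example and are not taken from the Ascension documentation.

```python
import math

# Assumed convention (not from the paper): each hemisphere is the half-space
# on the positive or negative side of one transmitter axis.
HEMISPHERE_AXES = {
    "forward": (1, 0, 0), "rear": (-1, 0, 0),
    "right": (0, 1, 0), "left": (0, -1, 0),
    "lower": (0, 0, 1), "upper": (0, 0, -1),
}


def in_capture_region(sensor_xyz_ft, hemisphere="forward",
                      max_radius_ft=8.0, expanded_addressing=True):
    """Return True if a sensor position (feet, transmitter frame) is usable.

    The 8 ft default is the low end of the 8 to 10 ft radius quoted in the
    text. In expanded addressing mode the sensor must stay at least 2 ft
    away from the transmitter; in normal mode it may come right up to it.
    """
    x, y, z = sensor_xyz_ft
    distance = math.sqrt(x * x + y * y + z * z)
    ax, ay, az = HEMISPHERE_AXES[hemisphere]
    in_hemisphere = (x * ax + y * ay + z * az) > 0
    min_radius_ft = 2.0 if expanded_addressing else 0.0
    return in_hemisphere and min_radius_ft <= distance <= max_radius_ft


# A sensor 3 ft in front of the transmitter is usable in either mode;
# a sensor 1 ft in front is usable only in normal addressing mode.
print(in_capture_region((3.0, 0.0, 0.0)))
print(in_capture_region((1.0, 0.0, 0.0)))
print(in_capture_region((1.0, 0.0, 0.0), expanded_addressing=False))
```

A check of this kind only flags positions outside the nominal envelope; it says nothing about distortion caused by metal or lighting inside the field.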


Figure  3.  Forward  hemisphere  configuration 

Field  Integrity  and  Virtual  World  Interaction  -  There  are 
many  factors  that  can  affect  the  field  integrity  of  an 
electromagnetic  motion  capture  system.  The  most 
obvious  is  the  presence  of  ferrous  metal  within  the  field 
envelope.  As  a  result  of  this  handicap,  physical  mock- 
ups  must  be  fabricated  using  alternative  materials,  like 
wood,  composites,  or  nonferrous  metals.  These  mock- 
ups  are  only  necessary  when  attempting  to  develop 
motion  files  for  a  specific  process  that  will  later  be  used 
as a basis for a digital human factors case study. Another
alternative is to digitally mock up the devices in a virtual
world and use an HMD to view the virtual environment.
The problem with this scenario is that it is not possible to
realistically interact with (i.e., grab and release) the
virtual objects associated with the process being studied.
One approach that has been used successfully is to
instrument the actual object with a position sensor and
constrain the corresponding digital model to the
associated sensor icon, so that the object’s position and
orientation in the virtual world mimic those in the real
world. By paying close attention to the placement of the
electromagnetic transmitter and its relationship to the
objects in the virtual world, one can minimize distortion,
resulting in extremely accurate motion capture data.
Other  factors  that  can  affect  field  integrity  are:  the 
proximity  of  fluorescent  lighting  to  the  transmitter,  power 
distribution  boxes  within  the  field,  steel-reinforcing  rods  in 
the  floor  if  it  is  a  concrete  slab,  and  performing  motion 
capture  at  the  limits  of  the  electromagnetic  field.  An 
example  of  where  you  would  see  problems  with  the 
electromagnetic  field  limit  would  be  with  a  CAVE  (figure 
4),  where  the  transmitter  is  elevated  as  a  result  of  system 
requirements. 

Figure 4. Electromagnetic field within a CAVE
environment 


The major problem with this configuration is that the
lower human extremities are already at the field limits,
and as you move out from under the transmitter, the
sensors fall outside the maximum radius of the field.
Steel-reinforcing rods can also be a major source of field
distortion when these configurations are built directly on
concrete slabs. Distortions in actual position have been
observed for this type of configuration (figure 5), where
the birds drift to one side as you move from the head
down to the feet.


Figure  5.  Sensor  position  distortion  at  the  feet  closest 
to  the  field  radius 

Using an elevated floor with a two-foot height, when
possible, built from the materials mentioned earlier can
minimize this distortion. Also, selecting areas for motion
capture with high bay ceilings can reduce the effects of
fluorescent lighting. Another option is simply to turn off
the lights over the transmitter.


IMMERSION - By using the resources at the two
organizations it was possible to evaluate the differences
between viewing the digital human in a CAVE
configuration and viewing it in an HMD. It needs to be
mentioned here that EAI-Jack does not yet run in a
CAVE, but the Delphi system allowed the evaluation of
motion capture in this type of configuration by using the
back wall as a large monitor.

CAVE  Automatic  Virtual  Environment  -  As  mentioned 
previously,  the  CAVE  configuration  has  problems  with 
distortion  of  the  position  and  orientation  for  the  sensors 
closest  to  the  maximum  radius  of  the  electromagnetic 
field. These problems can be ignored when the motion
capture scenarios are only concerned with the digital
human’s upper torso, as would be the case for processes
where the digital human’s tasks are fixed at one
location or station. When viewing a task in a CAVE
configuration, there are two potential modes of operation.
The first mode is the first person perspective. Virtual
objects in this mode appear life size and are positioned
around you, as you would expect in the actual
environment. The problem is aligning the digital human’s
appendages with your own, much like putting on a virtual
suit. One solution is to turn the digital human display off
and capture the motion for later playback/analysis. The
next
mode  is  that  of  viewing  the  virtual  world  from  a  third 
person  perspective.  This  approach  has  been  the  primary 
method  used  during  motion  capture  sessions  onsite  at 
Delphi.  The  benefit  here  is  in  the  ability  to  step  back  from 
the  scene  and  view  the  digital  environment  as  a  whole 
during  a  capture  session.  By  looking  at  the  virtual 
environment  in  this  manner  it  is  immediately  possible  to 
judge  the  success  or  failure  of  the  capture  session,  and 
visually  identify  problem  areas  with  the  motion  data.  This 
mode  has  the  added  benefit  of  allowing  multiple  people 
to  view  previously  recorded  sessions  for  case  studies  like 
a  design  review,  or  Integrated  Process/Product  Teams 
(IPT). 

Head  Mounted  Display  -  The  use  of  an  HMD  (figure  6) 
provides  the  ability  to  literally  look  at  the  world  through 
Jack’s eyes. To accomplish this you need either the
Multi Channel Option (MCO) installed on a Silicon
Graphics (SGI) ONYX or the newer SGI ONYX-2.
The  SGI  ONYX-2  provides  the  ability  to  selectively 
identify  and  output  regions  of  the  monitor  to  specific 
graphic  ports.  Having  met  these  constraints,  cameras 
can  be  attached  to  sites  located  on  both  the  right  and  left 
eye  of  the  Jack  figure,  moving  with  the  Jack  figure  in  the 
virtual  environment,  generating  the  view  perspective  for 
each  eye  in  two  separate  windows.  By  controlling  the 
output  display  of  these  windows  to  the  different  projectors 
in  an  HMD,  realistic  stereo  images  can  be  generated, 
only  limited  by  the  resolution  of  the  HMD  used,  and  the 
texturing  capabilities  of  the  software. 

Figure 6. Virtual Research V8 Head Mounted Display

Extreme  caution  needs  to  be  exercised  when  using  an 
HMD  for  a  full  body  motion  capture  session.  This  is 
primarily  due  to  the  potential  for  tripping  and/or  falling, 
resulting  from  the  tethers  used  to  attach  the  sensors  to 
the  CPU  catching  or  snagging  on  sharp  objects. 
Therefore,  the  use  of  an  HMD  is  recommended  for 
situations  where  the  operator  is  fixed  or  stationary,  as  in 
crew-station  evaluations. 

Communication: Ethernet vs. Serial in EAI-Jack - The
rate at which you can capture motion data, i.e., frames
per second (fps), is directly related to the available baud
rate. Under the Serial configuration on a standard Unix
operating system, the maximum baud rate for the Serial
port is 38,400 bits per second (bps). Frame rates around
4 fps have been seen using this configuration at both
TACOM and Delphi. Comparing this to the standard
animation frame rate of 30 fps, one can see that a
considerable amount of data is being dropped. Ethernet
has the potential to increase the frame rate, with the data
rate limited only by the network bandwidth. This has not
been tested due to hardware constraints. One potential
problem is that increasing the capture frame rate also
increases the size of the motion data file, to be
discussed later. Also, when using a Serial configuration
within the EAI-Jack environment, Ascension treats the
transmitter as bird 1, even though there is no position
sensor attached to this station. In Jack this can cause
confusion. For example, in a ten-bird setup you must
specify 11 birds in the FOB Graphical User Interface
(GUI), which means that birds 2 through 11 are the only
birds generating position and orientation data.
Therefore, care needs to be exercised in the placement
of each bird on the digital human, making sure that the
real bird placement corresponds to the virtual world
placement. This problem does not exist with the Ethernet
configuration in Jack.
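
The bird numbering under the Serial configuration can be made explicit with a small bookkeeping sketch. The Python below is not part of EAI-Jack or the Ascension software; the function names and the body-site labels are hypothetical and only illustrate the off-by-one relationship described above, in which the transmitter occupies bird 1 and the sensors occupy the addresses that follow.

```python
def serial_bird_layout(num_sensors):
    """Return (bird count to enter in the FOB GUI, sensor addresses).

    Under the Serial configuration the transmitter is counted as bird 1,
    so the sensors start at address 2.
    """
    transmitter_address = 1
    total_birds = num_sensors + 1
    sensor_addresses = list(range(transmitter_address + 1, total_birds + 1))
    return total_birds, sensor_addresses


def assign_sites(site_labels):
    """Pair each sensor address with a body-site label (labels are hypothetical)."""
    _, addresses = serial_bird_layout(len(site_labels))
    return dict(zip(addresses, site_labels))


total, addresses = serial_bird_layout(10)
print(total)         # bird count to specify for a ten-sensor setup
print(addresses)     # addresses that actually report position and orientation
print(assign_sites(["head", "pelvis", "right_hand"]))
```

Writing the mapping down before a session makes it easier to confirm that each physical bird is attached where its virtual counterpart expects it.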

DIGITAL  HUMAN  SETUP  WITHIN  THE  VIRTUAL 
ENVIRONMENT 

Electromagnetic Transmitter Location - The
electromagnetic transmitter, as well as the motion
sensors (or birds) explained previously, have
corresponding icons representing them in the virtual
environment. The relative distance between the physical
transmitter and the physical motion sensors corresponds
to the relative distance between the virtual transmitter
icon and the virtual motion sensor icons in the virtual
environment. When the virtual transmitter icon is moved
in the virtual environment, the virtual motion sensors
also move, but the relative distance between them
remains the same. The lower portion of figure 7 shows
that the location of the electromagnetic transmitter (black
cube in lower left) remains fixed; however, the virtual
electromagnetic transmitter icon (red cube on left side of
upper portion of figure 7) can be moved within the virtual
environment.


Figure  7.  Sensor  Relationship  to  Transmitter, 
Real  vs.  Virtual 


This  movement  of  the  virtual  transmitter  icon  is  useful 
because  motion  capture  for  different  environments  can 
be  done  without  physically  moving  to  different  locations. 
The  virtual  transmitter  icon  can  be  moved  to  any  position 
in  the  virtual  environment  and  motion  capture  can  take 
place  because  the  virtual  sensor  (or  virtual  bird)  icons 
move in relation to the virtual transmitter. In this case,
when  the  physical  sensors  are  attached  to  the  object  to 
be  tracked,  the  virtual  sensors  are  in  the  appropriate 
location  in  the  virtual  environment,  as  determined  by  the 
location  of  the  virtual  electromagnetic  transmitter  icon. 
When  the  virtual  electromagnetic  transmitter  icon  is 
moved  to  another  location  within  the  virtual  environment, 
the  virtual  motion  sensor  icons,  or  bird  icons,  also  move 
to  their  respective  new  location.  In  the  case  of  motion 
capture,  the  digital  human  model  and  virtual  objects 
which  represent  the  physical  human  and  physical  parts 
being  tracked  with  the  motion  sensors  also  move  to  a 
new  location  within  the  virtual  environment.  This 
essentially  accomplishes  a  digital  human  movement  from 
one  location  in  the  virtual  world  to  another  location 
without  making  the  motion-tracked  human  move  in  the 
physical  world. 

As  an  example,  suppose  a  human  needs  to  work  on  a 
part  at  three  different  workstation  areas.  In  the  case 
where  motion  capture  is  used  to  track  a  human  and  a 
physical  part,  three  separate  physical  workstation  areas 
would  have  to  be  designated  to  accommodate  human 
movement  between  each  work  area.  If,  however,  we  take 
advantage  of  the  fact  that  the  virtual  electromagnetic 
transmitter  icon  can  be  moved  within  the  virtual 
environment,  then  each  motion  capture  session  at  each 
workstation  can  be  accomplished  without  physically 
moving in the real world. By relocating the virtual
transmitter icon, the virtual motion sensors, which are
attached to the virtual human and virtual part, also move,
again maintaining the real world position and orientation
relationships of the transmitter to the sensors. The
virtual  human  and  virtual  part  can  now  be  relocated  to 
each  workstation  area,  while  the  motion-tracked  human 
in  the  physical  world  stays  stationary.  The  physical 
human  can  now  perform  the  operations  on  the  physical 
part  while  standing  in  the  same  physical  location  for  all 
three  workstation  areas.  The  virtual  human  and  virtual 
parts  are  repositioned  at  each  of  the  three  virtual 
workstation  areas  simply  by  relocating  the  virtual 
transmitter  within  the  virtual  environment.  When  the 
motions  of  the  human  and  part  are  captured,  the  motions 
applied  to  the  virtual  human  and  virtual  part  in  the  virtual 
simulation  environment  take  place  at  the  desired 
locations.  In  this  example,  the  captured  motions,  applied 
to the virtual human and virtual part, will play back at each
of  the  three  workstation  areas,  even  though  the  physical 
human  did  not  physically  move  to  another  workstation 
area  during  the  separate  motion  capture  sessions. 
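
The workstation example can be expressed as a rigid transform: each sensor is measured relative to the physical transmitter, and the pose of the virtual transmitter icon decides where that measurement lands in the virtual world. The Python sketch below is an illustration under simplifying assumptions (rotation about the vertical axis only, positions in arbitrary units, made-up station coordinates); the function name sensor_world_position is hypothetical and is not an EAI-Jack call.

```python
import math


def sensor_world_position(transmitter_pos, transmitter_yaw_deg, sensor_offset):
    """Place a sensor icon in the virtual world.

    transmitter_pos and transmitter_yaw_deg give the pose of the virtual
    transmitter icon; sensor_offset is the sensor position measured relative
    to the physical transmitter. The offset is rotated about the vertical
    (z) axis and then translated to the icon position.
    """
    yaw = math.radians(transmitter_yaw_deg)
    ox, oy, oz = sensor_offset
    rx = ox * math.cos(yaw) - oy * math.sin(yaw)
    ry = ox * math.sin(yaw) + oy * math.cos(yaw)
    tx, ty, tz = transmitter_pos
    return (tx + rx, ty + ry, tz + oz)


# Measured offsets stay the same because the physical human never moves.
hand_offset = (3.0, 1.0, 4.0)
part_offset = (3.5, 1.0, 3.0)

# Three virtual workstations, reached only by moving the transmitter icon.
stations = [((0.0, 0.0, 0.0), 0.0), ((10.0, 0.0, 0.0), 90.0), ((20.0, 5.0, 0.0), 180.0)]

for position, yaw_deg in stations:
    hand = sensor_world_position(position, yaw_deg, hand_offset)
    part = sensor_world_position(position, yaw_deg, part_offset)
    # The hand-to-part distance is identical at every station.
    print(hand, part, round(math.dist(hand, part), 6))
```

Because the same transform is applied to every sensor, the spacing between the hand and the part is preserved at each station, which is the property the text relies on.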

Site Position & Orientation. Constraining vs. Constrained -
Whether doing simple motion capture in EAI-Jack, such
as controlling only the head for viewing the virtual world
through an HMD, or generating data for the whole digital
human body, there are just two sites to be concerned with
for each segment controlled. These are the base site for
the sensor (Constraining) and the base site associated
with the controlled segment (Constrained). For example,
figure 8 displays both the base site for the FOB sensor
and the base site for the controlled right palm segment.


Figure  8.  Position  Sensor  (Constraining)  vs.  Digital 
Human  Hand  Site  (Constrained) 


Notice that the coordinate systems for both sites are
identical in orientation. If this were an actual motion
capture session, the position of both sites would also be
identical; the position has been offset here for clarity. The
key point is that the site controlling each segment should
duplicate as closely as possible the actual attachment of
the FOB sensor on the real human appendage for which
motion data is being generated. The sites provided on
the Jack digital human should therefore be adjusted to
reflect the positioning that results from the different
harnesses used to attach the sensors. By doing this, the
researcher can control the site positions and specify key
ergonomic markers, increasing the accuracy of the data
and its worth for later model validation.

Site Placement. How Many Sensors Do You Need - The
EAI-Jack digital human currently provides 11 sites as
attachment points for motion sensors (figure 9), with the
sensors attached as per the discussion in the previous
section (i.e., two on the feet, two on the knees, one on the
pelvis, two on the hands, two on the elbows, one on the
base of the neck, and one on top of the head). These
sites, while available, do not all have to be used when
capturing motion data, because Jack can generate
postures for those segments not constrained by sensors.
It is also possible to create more sites to be used during a
capture session, but at present there does not appear to
be a need. For the ten-sensor configurations used at both
organizations, a minimum of six sensors has been used
successfully. By attaching sensors to the feet, the pelvis,
the hands, and the head, motion data files have been
captured where the position data for each segment not
constrained by a sensor is within ± 3 inches.


Figure  9.  Digital  Human  Motion  Sensor  Attachment 
Points 

While these postures do not exactly match the human
motion, they have been close enough for the work being
done.  The  benefit  from  reducing  the  number  of  sensors 
used  to  capture  the  human  motion  is  that  these  sensors 
are  now  available  to  attach  to  physical  parts  where 
interaction in the virtual world is desired. Attempts have
been made to capture first the human motion for a
specific process and/or task, then the object motion for
the part being manipulated, and then to combine these
motions in the final animation. Anyone who has ever
attempted this can verify the difficulty of duplicating
three-dimensional paths during two separate motion
capture sessions. The resulting animation is like a race,
with the objects leaving and arriving at different times,
speeding up and slowing down, while following slightly
different paths. On the other hand, by using more
sensors to control the digital human motion, the
calculations required to determine postures are reduced,
if not entirely eliminated. This increases the accuracy of
the digital human motion data and reduces the latency
between the digital motion and the real motion. Latency is
the difference between the digital human motion and the
real human subject’s motion when viewed
simultaneously, with the digital motion lagging behind the
real motion by a fixed amount. This problem tends to be
more apparent when using an HMD configuration than
when using a CAVE. Both approaches (fewer sensors or
more sensors) have their benefits, and the decision of
which approach to use should be based upon the
ultimate goal of the motion capture session. That goal
can fall into two possible categories: either greater
digital human motion accuracy, required for human
factors studies, or digital human interaction with virtual
objects, when realistic process animations are required.
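
The trade-off between sensors on the human and sensors on parts amounts to a simple budget. The Python sketch below only restates the counts given in the text (11 attachment sites, a six-sensor minimum set, ten sensors available); the site labels and the function name sensors_left_for_parts are hypothetical and are not Jack site names.

```python
# Hypothetical labels for the 11 attachment sites listed in the text.
ATTACHMENT_SITES = [
    "left_foot", "right_foot", "left_knee", "right_knee", "pelvis",
    "left_hand", "right_hand", "left_elbow", "right_elbow",
    "neck_base", "head_top",
]

# The six-sensor set reported as the working minimum.
MINIMUM_SET = ["left_foot", "right_foot", "pelvis",
               "left_hand", "right_hand", "head_top"]


def sensors_left_for_parts(total_sensors, human_sites):
    """Return how many sensors remain for physical parts."""
    unknown = set(human_sites) - set(ATTACHMENT_SITES)
    if unknown:
        raise ValueError(f"unknown sites: {sorted(unknown)}")
    if len(human_sites) > total_sensors:
        raise ValueError("more sites selected than sensors available")
    return total_sensors - len(human_sites)


# With ten sensors and the minimum set, four sensors are free for parts.
print(sensors_left_for_parts(10, MINIMUM_SET))
```

Choosing more human sites improves posture accuracy but leaves fewer sensors for objects, which is the decision described in the paragraph above.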

Controlling  the  Head  -  In  the  early  phase  of  the 
program,  problems  were  encountered  that  involved  how 
the  head  reacted  to  being  directly  constrained  by  a 
sensor. Usually the reaction of the head produced
postures where the head rotated 90 degrees about the
vertical or lateral (side to side) axis. Needless to say, the
motion generated during these initial sessions was
humorous but not very productive. One approach
developed as an alternative to constraining the head was
to use the ability to force the Jack figure to fixate on an
object. To accomplish this, a script file was developed
that placed an object in space relative to the sensor at a
specified offset in front of the Jack figure, attached the
object to the sensor, and forced the figure to fixate on the
object (figure 10). The sensor that the object (a one-inch
cube) is attached to is the sensor attached to the head,
so the object tracks where the head looks. This approach
solved all problems previously experienced in
constraining the head during a motion capture session.
The object used as the fixation goal can also be turned on
and off during a motion capture session, providing a
realistic viewing experience. The resulting head motion
has only one problem, which occurs when the subject
attempts to look straight down, as in looking at your feet
in a standing posture. The head then tends to lean on
the right shoulder while looking down.


Figure 10. Digital Human Head Set to Fixate on
One-Inch  Cube 

MOTION  DATA  FILE  FORMAT,  CHANNELSETS  -  As 
discussed  in  the  document  “JACK/TTES:  A  System  for 
Production  and  Real-time  playback  of  Human  Figure 
Motion  in  a  DIS  Environment”  [2],  a  channel  is  defined  as 
storage  for  any  time-varying  parameter.  A  channelset  and 
a  sharedchannel  hold  the  same  data  as  a  channel, 
except  they  are  not  bound  to  a  specific  object.  A 
channelset  organizes  a  set  of  sharedchannels  together, 
giving  them  a  name  and  three  other  parameters  we  will 
discuss  later.  A  sharedchannel,  which  is  subject  to  the 
channelset  parameters,  introduces  three  more 
parameters  and  contains  the  entire  motion  data,  in  a 
frame  format  (i.e.,  N  number  of  frames  at  a  frame  rate  of 
so  many  frames  per  second). 

Channelset  File  Naming  Convention  -  Channelset  files  in 
the  EAI-Jack  environment  are  one  method  of  storing 
motion  data  resulting  from  a  motion  capture  session. 
The  problem  with  the  channelset  file  is  the  naming 
convention,  specifically  the  file  extension  required  when 
saving  the  data  to  your  hard  drive.  Channelset  motion 
data  files  use  the  “.env”  (environment)  extension. 
Anyone who is familiar with EAI-Jack file types knows that
the “.env” file is primarily used when building the virtual
world (i.e., locating objects, attaching textures, defining
colors, etc.). This can be confusing if the user does not
include “motion” in the file name when saving motion
data, so including it should be standard operating
procedure for naming motion data files.

Channelset  Motion  Data  File  Types  -  There  are  three 
basic  types  of  channelset  motion  data  files  that  can  be 
saved  when  doing  a  motion  capture  session.  They  are: 
human  motion  data,  object  motion  data,  and  a 
combination  of  both  human  and  object  motion  data.  The 
difference  between  the  human  and  the  object  motion 
data  is  that  besides  capturing  the  position  and  orientation 
for  the  root  site,  the  digital  human  file  also  contains  all  of 
the joint angles for the 68 joints defined in the EAI-Jack
digital  human.  These  68  joints  are  included  in  the 
channelset  file  as  68  sharedchannels,  which  will  be 
discussed  later. 

Channelset Motion Data File Syntax - Channelset
motion data files are ASCII text based. The benefit here
is that the file can be viewed and edited, and position
data can be extracted, using a standard ASCII text editor.
The first line of a channelset file defines the channelset
name (1). Later, during animation development in the
EAI-Jack animation system module, this is the name that
appears on the pull-down menu for “select channelset”
when using the “creating channelset motion”
command. The open brace ({) defines the beginning of
the channelset data, with the close brace placed after
the last line of data contained in the channelset file.

channelset  Channelset_Data_Name  {  (1) 

The next three lines of a channelset file define the three
parameters mentioned earlier. These parameters are
either specified during the setup for the motion capture
session or are generated as a result of the system
hardware configurations. The parameters are as follows:
the number of frames stored in each sharedchannel in
the set (2), the frames per second (fps) that the set was
saved at during the capture (3), and the number of
sharedchannels in the set (4). The semicolon informs
the motion data parser that information follows, and is
included on every line of motion data up to the last line.
The  motion  data  parser  is  internal  to  the  EAI-Jack 
environment  and  interpolates  the  frame  data  and  binds 
the  specific  joints  contained  in  a  channelset  file  to  the 
selected  digital  human  prior  to  playback. 


size = 28; (2)

fps = 3; (3)

count = 68; (4)
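The header lines in listings (1) through (4) can be read with a few lines of code. The following Python sketch is illustrative only: it assumes the header looks exactly like those listings, and the function name parse_channelset_header and the sample text are hypothetical and are not part of EAI-Jack.

```python
import re

# Patterns assume the layout shown in listings (1)-(4); this is an assumption,
# not a description of the parser internal to EAI-Jack.
HEADER_RE = re.compile(r"^\s*channelset\s+(\S+)\s*\{")
PARAM_RE = re.compile(r"^\s*(size|fps|count)\s*=\s*(\d+)\s*;")


def parse_channelset_header(text):
    """Return the channelset name and its size, fps, and count parameters."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty channelset text")
    match = HEADER_RE.match(lines[0])
    if match is None:
        raise ValueError("first line does not declare a channelset")
    header = {"name": match.group(1)}
    for line in lines[1:]:
        param = PARAM_RE.match(line)
        if param is None:
            break  # stop at the first line that is not a header parameter
        header[param.group(1)] = int(param.group(2))
    missing = {"size", "fps", "count"} - header.keys()
    if missing:
        raise ValueError(f"missing header parameters: {sorted(missing)}")
    return header


# Hypothetical sample assembled from the listings in this section.
SAMPLE = """channelset Channelset_Data_Name {
size = 28;
fps = 3;
count = 68;
sharedchannel sharedchannel_name {
"""

if __name__ == "__main__":
    print(parse_channelset_header(SAMPLE))
```

Because the file is plain ASCII, a check of this kind can be run on a capture before it is loaded, for example to confirm that a human motion file declares 68 sharedchannels.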

One problem experienced with the playback of a
channelset file results from the difference between the
default playback rate of 30 fps and the rate of capture,
which is usually around 4 fps for a serial configuration
with a baud rate of around 38,400 bps. EAI has been
working to solve this problem and is looking at
approaches that will give the user the flexibility to vary
the rate of playback.
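To make the mismatch concrete, a set captured at 4 fps and played back at 30 fps runs 7.5 times faster than real time. One way a user might compensate outside of EAI-Jack is to resample each channel to the playback rate before loading it. The Python sketch below uses plain linear interpolation on a list of scalar samples. The function names are hypothetical, this is not the approach EAI is pursuing, and linear interpolation of angles ignores wraparound.

```python
def playback_speedup(capture_fps, playback_fps=30.0):
    """How many times faster than real time the captured data plays back."""
    return playback_fps / capture_fps


def resample_channel(samples, capture_fps, playback_fps=30.0):
    """Linearly interpolate one channel of scalar samples to the playback rate.

    Assumes evenly spaced samples; angle wraparound is not handled.
    """
    if len(samples) < 2:
        return list(samples)
    duration = (len(samples) - 1) / capture_fps  # seconds spanned by the capture
    n_out = int(duration * playback_fps + 1e-9) + 1
    out = []
    for i in range(n_out):
        t = i / playback_fps * capture_fps  # position in capture-frame units
        lo = min(int(t), len(samples) - 2)  # index of the sample at or before t
        frac = t - lo
        out.append(samples[lo] * (1.0 - frac) + samples[lo + 1] * frac)
    return out


if __name__ == "__main__":
    print(playback_speedup(4))
    print(resample_channel([0.0, 1.0, 2.0], capture_fps=4, playback_fps=8))
```

Resampling increases the number of frames per channel, so the size parameter in the header would need to be updated to match.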
Now that the parameters for the channelset file have
been set, the next line in the file specifies the first
sharedchannel (5). If this is an object motion channelset
file, this will be the only sharedchannel. If this is a
human motion channelset, it will be the first of 68
sharedchannels.

sharedchannel sharedchannel_name { (5)

In both cases, the first sharedchannel of the human or
object motion file will be of the type “sharedfigure”. The
frame data for sharedchannels of this type describe the
root site of the figure and contain both the position and
the orientation of that site. Before moving on, a little
discussion of the root site is necessary. The root site for
a digital human is the site that controls the figure’s
position and orientation within the virtual world. By
moving the root site you move the figure. All joint
rotations  for  the  figure  tree  are  relative  to  the  root  site, 
with  the  captured  posture,  for  each  frame,  defined  by 
combining  the  joint  angles  for  each  segment  as  you  move 
out  along  the  figure  tree  away  from  the  root  site.  This 
provides the ability to go from a standing to a seated
posture while, for example, holding the hands at a fixed
point in space, or to move the figure to a new location
while maintaining
a  fixed  posture.  For  an  object,  the  root  site  is  the  base 
site  and  changes  in  position  and  orientation  of  the  object 
are  the  result  of  directly  manipulating  this  site.  This 
sharedchannel  and  all  other  sharedchannels  contained  in 
the  channelset  file  have  three  parameters  and  one  data 
field associated with them. The first parameter,
mentioned above, is used to specify the “type” of data
the sharedchannel contains. There are two options here:
either sharedfigure, as shown (6), or sharedjoint.
Sharedfigure specifies that figure location data is stored,
and sharedjoint specifies that joint angle data is stored.


type = "sharedfigure"; (6)

protofiletype = "human-5.9.fig"; (7)

object = "human_5_9"; (8)
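The three sharedchannel parameters in listings (6) through (8) are simple quoted strings, so they can be collected and checked against the two allowed types. The Python sketch below is a minimal illustration under that assumption; the function names and the sample lines are hypothetical, and the frame data field is deliberately not parsed because its syntax is not shown here.

```python
import re

# The two channel types named in the text.
VALID_TYPES = {"sharedfigure", "sharedjoint"}

# Matches lines such as: type = "sharedfigure";
STRING_PARAM_RE = re.compile(
    r'^\s*(type|protofiletype|object)\s*=\s*"([^"]*)"\s*;'
)


def parse_sharedchannel_params(lines):
    """Collect the type, protofiletype, and object strings of one sharedchannel."""
    params = {}
    for line in lines:
        match = STRING_PARAM_RE.match(line)
        if match:
            params[match.group(1)] = match.group(2)
    if params.get("type") not in VALID_TYPES:
        raise ValueError(f"unexpected sharedchannel type: {params.get('type')!r}")
    return params


def holds_root_site(params):
    """True for a sharedfigure channel, which carries root site position and orientation."""
    return params["type"] == "sharedfigure"


# Hypothetical sample mirroring listings (6)-(8).
SAMPLE_LINES = [
    'type = "sharedfigure";',
    'protofiletype = "human-5.9.fig";',
    'object = "human_5_9";',
]

if __name__ == "__main__":
    parsed = parse_sharedchannel_params(SAMPLE_LINES)
    print(parsed, holds_root_site(parsed))
```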

When  working  in  earlier  versions  of  Jack,  version  1.2  or 
older,  the  protofiletype  (7)  specified  what  digital  human 
figure  or  object  was  required  for  playback.  In  effect,  it 
specified  the  only  file  that  could  be  bound  to  the 
channelset  file  during  playback.  Under  the  newer 
versions,  2.2  and  greater,  this  limitation  has  been 
removed  and  the  only  constraint  is  that  the  figure  be  of  a 
similar  nature  (i.e.,  digital  human,  same  number  of  joints, 
with similar site names). The “object” parameter (8) has
two uses. In the case of a “sharedfigure” channel, it is a
string that holds the name of the figure used to generate
the data. In the case of a “sharedjoint” channel, it is the
name of the specific joint and is used to locate the joint
to which the data is bound. The last piece of information
included in the sharedchannel, the data field, is the frame
data.
The frame is where all of the motion data is stored in the
file. Note that for a motion capture session that lasts
several minutes (about 205 seconds, for example) at a
rate of 4 fps, each sharedchannel stores 820 frames of
data. For a digital human file with 68 sharedchannels,
this equates to at least 55,000 lines of data. Motion data
files can clearly become quite large in a very short time. As
mentioned earlier, for a sharedfigure channel the frame
contains the root site, its orientation in degrees, and its
position in centimeters (cm). For the digital human, this
is the first sharedchannel and the only sharedchannel of
this type. For an object motion data file, the sharedfigure
type is the only sharedchannel contained in the file. Due
to the length of the frame data for this type, an example
of the data is not shown. The remainder of the
sharedchannels for a digital human file, as mentioned
earlier, are of the type “sharedjoint”. The frame data for
this type contain only joint angle data, and the number of
angles generated depends upon the degrees of freedom
of each joint. The joint angles written to each frame for
the sharedchannel are specified in radians. As a point of
interest, during one phase of the research, joint angle
data for the right hand was moved to the left hand’s
sharedchannel as part of an extensive cut and paste
editing session. The result was motion data for the left
hand that mirrored the movement of the right hand. This
work directly initiated the request for the development or
integration of motion data editors in the EAI-Jack
environment.
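The file size figures quoted earlier in this section can be checked with simple arithmetic. The Python sketch below assumes one frame record per frame per sharedchannel, which is an assumption about how the line count was made rather than a statement of the file layout. The function names are hypothetical.

```python
def total_frame_records(frames_per_channel, channel_count=68):
    """Frame records in a file, assuming one record per frame per sharedchannel."""
    return frames_per_channel * channel_count


def capture_seconds(frames_per_channel, fps):
    """Length of the capture implied by a frame count and a capture rate."""
    return frames_per_channel / fps


if __name__ == "__main__":
    # 820 frames in each of 68 sharedchannels
    print(total_frame_records(820, 68))  # 55760
    # 820 frames captured at 4 fps
    print(capture_seconds(820, 4))       # 205.0
```

The result, 55,760 records, is consistent with the "at least 55,000 lines" figure, and it illustrates why hand editing of a human motion file in a text editor quickly becomes impractical.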

BENEFITS  OF  HUMAN  MOTION  CAPTURE  -  Human 
motion  capture  that  is  used  to  provide  human  motion 
data  to  virtual  simulations  containing  digital  human 
models  is  quickly  becoming  a  valuable  tool  for  both  the 
U.S. Army TACOM and Delphi Automotive Systems. The
uses of human motion capture will increase as the
technology, accuracy, and supporting software improve.
Even now, however, the uses of human motion capture
in virtual simulations are significant. These human
motion capture and analysis applications are detailed in
the following sections.

Product  Engineering  And  Design  -  Human  motion 
capture  for  use  within  virtual  simulations  can  be  an 
invaluable  tool  for  digital  product  mock-ups  used  during 
the  product  design  process.  The  products  may  vary 
between  the  two  users,  automotive  interiors  for  Delphi, 
and  military  vehicle  crew  stations  for  TACOM,  but  the 
human  motion  capture  applications  remain  consistent. 
When  a  human  motion  capture  session  is  conducted 
within  a  virtual  design  mock-up  or  physical  design 
representation,  the  instrumented  person  interacts  with 
the product to validate the design criteria. These criteria
include ergonomic reach, human body to product
interference, maneuverability within the vehicle, sight
lines to instruments or the vehicle exterior, ergonomic
comfort, ingress/egress, and ergonomic space
requirements. The
virtual  product  design  is  validated  using  a  digital  human 
model  that  can  apply  the  captured  human  motion  data. 
The  digital  human  interacts  with  the  virtual  product  mock- 
up  while  the  ergonomic  analysis  software  analyzes  the 
ergonomic  criteria  for  a  full  range  of  humans  (typically  5th 
to 95th percentile). The product design is then modified
to optimize ergonomic interaction. Using virtual product
mock-ups and virtual ergonomic simulation and analysis,
rather than manually analyzing ergonomic criteria on
physical product mock-ups, saves time and money. Time
is also saved when creating the ergonomic simulations.
Human motion capture data can be produced in the
same amount of time a person needs to perform the
operation, whereas creating the same motions using the
ergonomic software tools takes significantly longer. For
example, manually scripting a simple lifting exercise, in
which a one-foot square block was raised from one shelf
to a higher one, took three hours of programming. The
same exercise using motion capture, including the time to
instrument a person and perform several motion capture
iterations, took 20 minutes.

Manufacturing Engineering - The applications of human
motion capture for manufacturing engineering are similar
to those in the previous product engineering design
discussion. The difference is the objects that the human
interacts with. In this case, an instrumented human
interfaces with a manufacturing work area, machine,
table, station, or any other manufacturing-related
equipment. The
manufacturing  process  is  analyzed  using  the  captured 
human  motion  data  applied  to  a  digital  human  contained 
in  a  virtual  manufacturing  work  environment.  Through 
ergonomic  analysis  software,  the  processes  and 
captured  human  motions  are  analyzed  using  digital 
human  models,  5th  to  95th  percentile,  to  determine  if 
ergonomic design criteria are met. The ergonomic
criteria ensure the safety of the workers by addressing
reach, fatigue, energy expenditure, posture, repetitive
motion, strength, lifting, etc. In this application, analyzing
manufacturing  equipment  and  process  designs  before 
building  costly  models  and  equipment  saves  time  and 
money.  Improvements  can  be  made  if  the  process  is 
analyzed  using  digital  human  models  that  are 
manipulated  using  captured  human  motion  data.  For 
example,  a  videotape  of  a  process  believed  to  be 
stressful  on  the  knees  was  used  to  choreograph  a  motion 
capture  session.  The  resulting  simulation  and  the 
subsequent  analysis  flagged  the  suspect  posture  and 
provided  valuable  insight. 

Maintenance  Process  Design  -  In  this  scenario, 
maintenance  operations  can  be  evaluated  by  capturing 
human  motion  data  while  a  person  interacts  with  a  digital 
product  mock-up.  By  evaluating  the  product  accessibility 
using  an  instrumented  human  operating  on  a  virtual 
product  mock-up,  design  changes  can  be  made  before 
the  design  is  finalized  and  physically  built.  Ergonomic 
issues  with  maintenance  operations  can  be  evaluated 
with  a  full  range  of  digital  humans  (typically  5th  to  95th 
percentile) within a virtual environment. As a result,
product design changes can be made earlier in the
design process, thus avoiding costly and time-consuming
changes later.

Process  Training  -  Human  motion  capture  can  also  be 
applied  as  a  training  tool.  Rather  than  produce  a  written 
procedure  documenting  an  operation,  a  video  clip  can  be 
produced  using  an  ergonomic  software  package.  An 
instrumented  person  would  perform  an  operation  while 
the  motions  are  captured  and  stored.  The  human  motion 
data  would  then  be  applied  to  a  digital  human  model 
placed within a virtual simulation environment. The
entire simulation could then be output in a movie clip
format that could be shipped to various sites to visually
convey the detailed operation, thus reducing the
interpretation errors that could result from a written
operation description. Another potential approach is to
export the digital simulation to either the Virtual Reality
Modeling Language (VRML) or Java3D as a web-based
animation. This would provide the ability to
share  the  simulation  over  a  local  Intranet  or  via  the 
Internet  and  provide  a  remote  viewer  the  ability  to  zoom 
in/out  or  change  the  viewing  angle  as  the  simulation 
progresses.  This  would  provide  the  remote  viewer  the 
capability  to  focus  on  critical  areas  of  the  animation 
sequence. 

CONCLUSION 

The intent in writing this paper was to develop a
document that could be used as a reference by
organizations just getting started in the field of motion
capture and digital human simulation. The basis for this
document is the lessons learned during the cooperative
research between TACOM and Delphi. This paper
potentially provides alternative approaches to
organizations doing similar work but focusing on different
corporate goals or objectives.

As  stated  in  the  introduction,  the  focus  was  on  the 
electromagnetic  motion  capture  hardware  configuration 
for  the  two  organizations.  Discussion  topics  covered 
Ascension’s  different  systems,  hardware  placement,  field 
integrity,  field  orientation  (i.e.,  hemisphere),  and  the 
effects  of  the  different  operating  modes  on  field  strength. 
The  focus  then  shifted  to  the  different  options  for  viewing 
the  virtual  world  (i.e.,  CAVE  vs.  HMD).  The  problems 
encountered  with  each  configuration,  and  the  unique 
system  traits,  discovered  as  a  result  of  this  research, 
were  also  addressed.  The  discussion  then  introduced 
the  interaction  and  setup  of  the  EAI-Jack  digital  human 
with  the  motion  capture  hardware.  Topics  touched  on 
included:  transmitter  location,  site  position  and 
orientation,  the  number  of  sensors  required,  and  an 
alternative  approach  to  controlling  the  digital  human’s 
head.  Moving  on,  the  discussion  focused  on  the 
channelset  motion  data  file,  breaking  it  down  to  the 
individual  syntax  for  the  data  contained  in  the  file.  Finally, 
the  discussion  focused  on  the  potential  benefits  resulting 
from  the  combined  use  of  motion  capture  and  digital 
humans,  from  the  perspective  of  the  two  different 
organizations. 

So what’s next? Last year the optical motion capture
industry demonstrated real-time motion capture and
playback. The significance here lies in a benefit that the
commercial entertainment industry has recognized from
the start: no tethers are needed to connect the motion
sensors to the human body during motion capture. Up
until this point, all optical motion capture data had to be
post-processed prior to playback, which was a very
time-consuming task requiring extensive user interaction
during editing to eliminate data spikes. One potential
benefit is the ability to generate motion data within a
large space with no tethers to worry about; when an HMD
is integrated, its tether is the only one to worry about
while immersed in a virtual world. Optical
systems  have  the  added  benefit  of  saving  motion  data 
files  in  the  standard  formats  used  by  the  entertainment 
industry  for  playback  in  commonly  used  commercial 
simulation  packages,  where  motion  data  editors  already 
exist. The current drawback is that software drivers have
yet to be developed that integrate optical motion capture
systems with ergonomic simulation and analysis
environments like EAI-Jack. The other area, touched
upon briefly in the paper and alluded to here, is the
complete lack of motion data editors for the channelset
file format. One potential resource is the development of
a  motion  capture  data  library,  where,  instead  of  capturing 
data,  existing  data  for  similar  processes  can  be  modified 
to  fit  the  task  being  analyzed.  Consequently,  new  motion 
capture  data  is  developed  only  when  the  human  motion 
required  to  accomplish  a  process  or  task  is  new  or 
unique.  In  closing,  the  authors  hope  that  this  discussion 
will  act  as  food  for  thought,  initiating  new  programs  and 
continued  research  in  these  areas. 
DEFINITIONS,  ACRONYMS,  ABBREVIATIONS 

AC: Alternating Current

bps: bits per second

CAVE: CAVE Automatic Virtual Environment

cm: centimeters

CPU: Central Processing Unit

DC: Direct Current

EAI: Engineering Animation Incorporated

ERT: Extended Range Transmitter

FOB: Flock of Birds

fps: frames per second

GUI: Graphic User Interface

HMD: Head Mounted Display

IPT: Integrated Process/Product Team

MCO: Multi Channel Option

SGI: Silicon Graphics

TACOM: Tank Automotive & Armaments Command

TARDEC: TACOM, Research, Development, and Engineering Center

VRML: Virtual Reality Modeling Language